Bootstrapping Techniques in Statistical Analysis and Approaches in R
Abstract
The true probability distribution of a test statistic is rarely known. Generally, its asymptotic law is used as an approximation of the true law. If the sample size is not large enough, the asymptotic distribution of the statistic can be a poor approximation of the true one. Under some regularity conditions, bootstrap methods make it possible to obtain a more accurate approximation of the distribution of the test statistic. The bootstrap is a method for deriving properties (standard errors, confidence intervals and critical values) of the sampling distribution of estimators. It treats the sample (the values of the independent and dependent random variables) as the population and the sample estimates as the true values. Instead of drawing from a specified distribution with a random number generator, the bootstrap draws with replacement from the sample. The article discusses several bootstrap methods and runs simulations for some of them.

1 Definition

1.1 General illustration in Bootstrap World

Consider a sample of $N$ independent observations, indexed $n = 1, ..., N$, of a dependent variable $y$ and $M + 1$ explanatory variables $x$. A paired bootstrap sample is obtained by independently drawing $N$ pairs $(x_i, y_i)$ from the observed sample with replacement. The bootstrap sample has the same number of observations as the original sample, but some observations appear several times and others not at all. The bootstrap involves drawing a large number $B$ of bootstrap samples. A single bootstrap sample is denoted $(x_b^*, y_b^*)$, where $x_b^*$ is an $N \times (M + 1)$ matrix and $y_b^*$ is an $N$-dimensional column vector of the data in the $b$-th bootstrap sample.

1.2 Bootstrap Standard Errors

The standard error $se(\hat{\theta})$ of an estimator $\hat{\theta}$ can be approximated by the empirical standard deviation of a series of bootstrap replications of $\hat{\theta}$. The approach is:

1. Draw $B$ independent bootstrap samples $(x_b^*, y_b^*)$ of size $N$ from $(x, y)$. Usually $B = 100$ replications are sufficient.
2. Estimate the parameter $\theta$ of interest for each bootstrap sample: $\hat{\theta}_b^*$ for $b = 1, ..., B$.
3. Estimate $se(\hat{\theta})$ by $\widehat{se} = \sqrt{\frac{1}{B} \sum_{b=1}^{B} (\hat{\theta}_b^* - \hat{\theta}^*)^2}$
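The three steps for the bootstrap standard error can be sketched in a few lines. This is a minimal illustration, not the article's own code. It is written in Python with the standard library only, it assumes a single explanatory variable rather than M + 1, it uses the slope of a simple regression as the estimator of interest, and it reads the second term of the formula as the mean of the B bootstrap replications. The function names (`paired_bootstrap_sample`, `ols_slope`, `bootstrap_se`) and the simulated data are hypothetical.

```python
import math
import random


def paired_bootstrap_sample(x, y, rng):
    """Draw N (x_i, y_i) pairs with replacement, keeping each pair together."""
    n = len(y)
    idx = [rng.randrange(n) for _ in range(n)]
    return [x[i] for i in idx], [y[i] for i in idx]


def ols_slope(x, y):
    """Illustrative estimator (an assumption): slope of y regressed on one x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def bootstrap_se(x, y, estimator, B=100, seed=0):
    """Steps 1-3: resample B times, re-estimate, take the empirical std. dev."""
    rng = random.Random(seed)
    reps = [estimator(*paired_bootstrap_sample(x, y, rng)) for _ in range(B)]
    mean_rep = sum(reps) / B  # assumed meaning of the second term in the formula
    # Divides by B, as in the formula above (not by B - 1).
    return math.sqrt(sum((r - mean_rep) ** 2 for r in reps) / B)


# Simulated data for illustration only: y = 1 + 2x + standard normal noise.
data_rng = random.Random(42)
x = [data_rng.uniform(0, 10) for _ in range(50)]
y = [1.0 + 2.0 * xi + data_rng.gauss(0, 1) for xi in x]

se = bootstrap_se(x, y, ols_slope, B=100, seed=1)
print(f"slope = {ols_slope(x, y):.3f}, bootstrap se = {se:.3f}")
```

Resampling index positions rather than values is what keeps each x_i matched to its y_i, which is the defining feature of the paired bootstrap. Fixing the seed makes the B replications reproducible.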
Publication date: 2013